Abstract

We consider optimal control problems for diffusion processes, where the objective functional
is defined by a time-consistent dynamic risk measure. We focus on coherent risk measures
defined by g-evaluations. For such problems, we construct a family of time and space
perturbed systems with piecewise-constant control functions. We obtain a regularized optimal
value function by a special mollification procedure. This allows us to establish a bound on the
difference between the optimal value functions of the original problem and of the problem with
piecewise-constant controls.


1 Introduction 

The first introduction of a coherent (static) risk measure, by Artzner et al. [2, 3], was motivated by
the capital adequacy rules of the Basel Accord. A large volume of research has been devoted to this area:
Föllmer and Schied ifTTl and Frittelli and Rosazza Gianin [20] generalized the concept to convex risk
measures, and Ruszczyński and Shapiro iHTI studied it from the perspective of optimization. Several
classical references concerning static risk measures are |[T8l[T9ll^l42ll .

Further development of the theory of risk measures led to a dynamic setting, in which the
risk is measured at each time instance based on the updated information. The key condition of
time-consistency allows for dynamic programming formulations. The discrete-time case was
extensively explored by Detlefsen and Scandolo IfTSlI, Bion-Nadal @, Cheridito et al. IHITOl, Föllmer
and Penner lfThl, Frittelli and Scandolo [21], Riedel [37], and Ruszczyński and Shapiro iHBl. For
the continuous-time case, Coquet, Hu, Mémin and Peng ifTll discovered that time-consistent
dynamic risk measures can be represented as solutions of Backward Stochastic Differential Equations
(BSDE) (see also |[3^[38ll l). Inspired by that, Barrieu and El Karoui provided a comprehensive study
in 111151; further contributions were made by Delbaen, Peng, and Rosazza Gianin |[T2l, and by Quenez
and Sulem [36] (for a more general model with Lévy processes). In addition, applications to finance
were considered, for example, in flM. Using the convergence results of Briand, Delyon and Mémin
[^, Stadje [44] finds the drivers of BSDE corresponding to discrete-time risk measures.

As for control with risk aversion, in the discrete-time setting, Ruszczyński [39], Çavuş and Ruszczyński
[8], and Fan and Ruszczyński llT4l developed the concept of a Markov risk measure and proposed
risk-averse dynamic programming equations as well as computational methods. Our intention is to
use continuous-time dynamic risk measures as objective functionals in optimal control problems
for diffusion processes. While traditional continuous-time stochastic control is well developed and
discussed in numerous books (see, e.g., mSl 1221351 STD), the risk-averse case appears to be largely
unexplored. In the present paper, we consider the risk-averse case with coherent risk measures
given by g-evaluations. Such control problems are closely related to forward-backward systems of
stochastic differential equations (FBSDE) (see Il2^l3^ 1). For controlled fully coupled FBSDEs, Li
and Wei [25] obtained the dynamic programming equation and derived the corresponding
Hamilton-Jacobi-Bellman equation. The maximum principle for forward-backward systems and
corresponding games was derived in [27, 28], including models with Lévy processes.

The contribution of this paper is the study of the accuracy of discrete-time approximations of
risk-averse optimal control problems with coherent risk measures given by g-evaluations. For the
purpose of the study, we construct a family of perturbed systems with two types of perturbations: of the
initial time and of the initial state. For such a family, we integrate the value functions of a
piecewise-constant control with respect to the said initial time and state values. This yields regularized
functions to which Itô calculus can be applied. Using the earlier results on the Hamilton-Jacobi-Bellman
equation for risk-averse problems, we establish an error bound between the optimal values of the
original system and of a system with piecewise-constant controls, expressed in terms of the time step.

Section 2 has a synthetic character. We review in it the concept of F-consistent evaluations and
the connections to backward stochastic differential equations and dynamic risk measures. In Section 3 we
formulate the risk-averse optimal control problem and study its basic properties. We also recall
the dynamic programming equation and the risk-averse analog of the Hamilton-Jacobi-Bellman
equation. In Section 4 we construct a family of time and space perturbed problems. They
are used in a specially designed mollification procedure in Section 5, which yields sufficiently smooth
close approximations of the optimal value function. In Section 6 we estimate the accuracy of control
policies restricted to piecewise-constant controls in terms of the time discretization step.

2 Foundations 

2.1 Nonlinear Expectations and Dynamic Risk Measures 

We establish a suitable framework and briefly review the concept of F-consistent nonlinear
expectations (for an extensive treatment, see 122). For $0 < T < \infty$, let $(\Omega, \mathcal{F}, P)$ be a probability
space, where $\mathbb{F} = \{\mathcal{F}_t\}_{0 \le t \le T}$ is a filtration. A vector-valued stochastic process $\{X_t\}_{0 \le t \le T}$ is said
to be adapted to $\mathbb{F}$ if $X_t$ is an $\mathcal{F}_t$-measurable random variable for any $t \in [0, T]$.

We introduce the following notation.

• $E_t[\,\cdot\,] := E[\,\cdot\,|\,\mathcal{F}_t]$;

• $\mathcal{P}^m[t, T]$: the set of $\mathbb{R}^m$-valued adapted processes on $[t, T] \times \Omega$;

• $L^2(\Omega, \mathcal{F}_t, P; \mathbb{R}^m)$: the set of $\mathbb{R}^m$-valued $\mathcal{F}_t$-measurable random variables $\xi$ such that
$\|\xi\|^2 := E[\,|\xi|^2\,] < \infty$; for $m = 1$, we write it $L^2(\Omega, \mathcal{F}_t, P)$;

• $\mathcal{S}^{2,m}[t, T]$: the set of elements $Y \in \mathcal{P}^m[t, T]$ such that
$\|Y\|^2_{\mathcal{S}^{2,m}[t,T]} := E[\sup_{t \le s \le T} |Y_s|^2] < \infty$; for $m = 1$, we write it $\mathcal{S}^2[t, T]$;

• $\mathcal{H}^{2,m}[t, T]$: the set of elements $Y \in \mathcal{P}^m[t, T]$ such that
$\|Y\|^2_{\mathcal{H}^{2,m}[t,T]} := E[\int_t^T |Y_s|^2\,ds] < \infty$; for $m = 1$, we write it $\mathcal{H}^2[t, T]$;¹

• $C^{1,2}([t, T] \times \mathbb{R}^m)$: the space of functions $f : [t, T] \times \mathbb{R}^m \to \mathbb{R}$, which are differentiable with
respect to the first argument and twice differentiable with respect to the second argument,
with all these derivatives continuous with respect to both arguments;

• $C_b^{1,2}([t, T] \times \mathbb{R}^m)$: the space of functions $f \in C^{1,2}([t, T] \times \mathbb{R}^m)$ with all derivatives bounded
and continuous with respect to both arguments;

• $C_0^\infty(B)$: the space of functions $f : B \to \mathbb{R}$ that are infinitely continuously differentiable with
respect to all arguments and have compact support on $B \subset \mathbb{R}^n$.

With this notation, we can introduce the concept of a nonlinear expectation. 

Definition 2.1. For $0 < T < \infty$, a nonlinear expectation is a functional $\rho_{0,T} : L^2(\Omega, \mathcal{F}_T, P) \to \mathbb{R}$
satisfying the strict monotonicity property:

if $\xi_1 \ge \xi_2$ a.s., then $\rho_{0,T}[\xi_1] \ge \rho_{0,T}[\xi_2]$;

if $\xi_1 \ge \xi_2$ a.s., then $\rho_{0,T}[\xi_1] = \rho_{0,T}[\xi_2]$ if and only if $\xi_1 = \xi_2$ a.s.;

and the constant preservation property:

$\rho_{0,T}[c\,1_\Omega] = c$, for all $c \in \mathbb{R}$,

where $1_A$ is the characteristic function of the event $A \in \mathcal{F}_T$.

Based on that, the F-consistent nonlinear expectation is defined as follows.

Definition 2.2. For a filtered probability space $(\Omega, \mathcal{F}, P, \mathbb{F})$, a nonlinear expectation $\rho_{0,T}[\,\cdot\,]$ is
F-consistent if for every $\xi \in L^2(\Omega, \mathcal{F}_T, P)$ and every $t \in [0, T]$ a random variable $\eta \in L^2(\Omega, \mathcal{F}_t, P)$
exists such that

$\rho_{0,T}[\xi 1_A] = \rho_{0,T}[\eta 1_A]$ for all $A \in \mathcal{F}_t$.

The variable $\eta$ in Definition 2.2 is uniquely defined; we denote it by $\rho_{t,T}[\xi]$. It can be
interpreted as a nonlinear conditional expectation of $\xi$ at time $t$. We can now define for every $t \in [0, T]$
the corresponding nonlinear expectation $\rho_{0,t} : L^2(\Omega, \mathcal{F}_t, P) \to \mathbb{R}$ as follows: $\rho_{0,t}[\xi] = \rho_{0,T}[\xi]$
for all $\xi \in L^2(\Omega, \mathcal{F}_t, P)$. In this way, a whole system of F-consistent nonlinear expectations
is defined.

¹When the norm is clear from the context, the subscripts are skipped.

Proposition 2.3. If $\rho_{0,T}[\,\cdot\,]$ is an F-consistent nonlinear expectation, then for all $0 \le t \le T$ and all
$\xi, \xi' \in L^2(\Omega, \mathcal{F}_T, P)$, it has the following properties:

(i) Generalized constant preservation: if $\xi \in L^2(\Omega, \mathcal{F}_t, P)$, then $\rho_{t,T}[\xi] = \xi$;

(ii) Time consistency: $\rho_{s,T}[\xi] = \rho_{s,t}[\rho_{t,T}[\xi]]$, for all $0 \le s \le t$;

(iii) Local property: $\rho_{t,T}[\xi 1_A + \xi' 1_{A^c}] = 1_A\,\rho_{t,T}[\xi] + 1_{A^c}\,\rho_{t,T}[\xi']$, for all $A \in \mathcal{F}_t$.

It follows that F-consistent nonlinear expectations are special cases of dynamic time-consistent 
measures of risk, enjoying a number of useful properties. They do not, however, have the properties 
of convexity, translation invariance, or positive homogeneity, unless additional assumptions are 
made. We shall return to this issue in the next subsection. 

2.2 Backward Stochastic Differential Equations and g-Evaluations

A close relation exists between F-consistent nonlinear expectations on the space $L^2(\Omega, \mathcal{F}_T, P)$, with
the natural filtration of the Brownian motion, and backward stochastic differential equations (BSDE)
ll^|30l[32l. We equip $(\Omega, \mathcal{F}, P)$ with a d-dimensional Brownian filtration, i.e.,
$\mathcal{F}_t = \sigma(\{W_s;\ 0 \le s \le t\} \cup \mathcal{N})$, where $\mathcal{N}$ is the collection of P-null sets in $\Omega$. In this paper we consider
the following one-dimensional BSDE:

$-dY_t = g(t, Y_t, Z_t)\,dt - Z_t\,dW_t, \quad Y_T = \xi$, (1)

where the data is the pair $(\xi, g)$, called the terminal condition and the generator (or driver),
respectively. Here, $\xi \in L^2(\Omega, \mathcal{F}_T, P)$, and $g : [0, T] \times \mathbb{R} \times \mathbb{R}^d \times \Omega \to \mathbb{R}$ is a measurable function (with
respect to the product σ-algebra), which is nonanticipative, that is, $g(t, Y_t, Z_t)$ is $\mathcal{F}_t$-measurable for
all $t \in [0, T]$.

The solution of the BSDE is a pair of processes $(Y, Z) \in \mathcal{S}^2[0, T] \times \mathcal{H}^{2,d}[0, T]$ such that

$Y_t = \xi + \int_t^T g(s, Y_s, Z_s)\,ds - \int_t^T Z_s\,dW_s, \quad t \in [0, T]$. (2)

The existence and uniqueness of the solution of (1) can be guaranteed under the following
assumption.

Assumption 2.4 (Peng and Pardoux [29]). (i) g is jointly Lipschitz in $(y, z)$, i.e., a constant $K > 0$
exists such that for all $t \in [0, T]$, all $y_1, y_2 \in \mathbb{R}$ and all $z_1, z_2 \in \mathbb{R}^d$ we have

$|g(t, y_1, z_1) - g(t, y_2, z_2)| \le K(|y_1 - y_2| + |z_1 - z_2|)$ a.s.;

(ii) the process $g(\cdot, 0, 0) \in \mathcal{H}^2[0, T]$.

Under Assumption 2.4 we can define the g-evaluation.

Definition 2.5. For each $0 \le t \le T$ and $\xi \in L^2(\Omega, \mathcal{F}_T, P)$, the g-evaluation at time t is the operator
$\rho^g_{t,T} : L^2(\Omega, \mathcal{F}_T, P) \to L^2(\Omega, \mathcal{F}_t, P)$ defined as follows:

$\rho^g_{t,T}[\xi] = Y_t$, (3)

where $(Y, Z) \in \mathcal{S}^2[0, T] \times \mathcal{H}^{2,d}[0, T]$ is the unique solution of (1).
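
To make Definition 2.5 concrete, the following Python sketch approximates a g-evaluation in one dimension (d = 1) by a backward Euler scheme on a binomial random walk. The scheme, the function name `g_evaluation_binomial`, and the test driver g(t, z) = κ|z| are illustrative assumptions and are not taken from the paper; the driver is assumed not to depend on y.

```python
import math

def g_evaluation_binomial(terminal, driver, T=1.0, steps=50):
    """Approximate rho^g_{0,T}[xi] for xi = terminal(W_T), with d = 1.

    Brownian motion is replaced by a symmetric random walk with increments
    +/- sqrt(dt), and the BSDE (1) is discretized backward in time by
        Z_i = E_i[Y_{i+1} * dW] / dt,
        Y_i = E_i[Y_{i+1}] + driver(t_i, Z_i) * dt.
    """
    dt = T / steps
    sq = math.sqrt(dt)
    # Node j at level i corresponds to the walk value (2*j - i) * sqrt(dt).
    y = [terminal((2 * j - steps) * sq) for j in range(steps + 1)]
    for i in range(steps - 1, -1, -1):
        t = i * dt
        new_y = []
        for j in range(i + 1):
            up, down = y[j + 1], y[j]
            z = (up - down) / (2.0 * sq)   # E[Y * dW] / dt on the two-point walk
            mean = 0.5 * (up + down)       # E[Y]
            new_y.append(mean + driver(t, z) * dt)
        y = new_y
    return y[0]

kappa, a, T = 0.3, 2.0, 1.0
value = g_evaluation_binomial(lambda w: a * w, lambda t, z: kappa * abs(z), T=T)
# For xi = a * W_T, equation (1) with g = kappa*|z| is solved by Z = a and
# Y_t = a*W_t + kappa*|a|*(T - t), hence Y_0 = kappa*|a|*T.
closed_form = kappa * abs(a) * T
```

For a linear terminal value the scheme reproduces the closed-form value up to rounding, and with the zero driver it reduces to the ordinary expectation on the walk.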
The following theorem reveals the relationship between g-evaluations and F-consistent nonlinear
expectations.

Theorem 2.6. Let the driver g satisfy Assumption 2.4 and the condition: $g(\cdot, 0, 0) = 0$ a.s. Then
the system of g-evaluations $\{\rho^g_{t,T}\}_{0 \le t \le T}$ is a system of F-consistent nonlinear
expectations. Furthermore, we have

$\lim_{s \uparrow t} \rho^g_{s,t}[\xi] = \xi$, for all $\xi \in L^2(\Omega, \mathcal{F}_t, P)$, $t \in [0, T]$.

Surprisingly, Coquet, Hu, Mémin, and Peng proved in IfTTl that every F-consistent nonlinear
expectation which is "dominated" by $\rho^{\mu,\nu}$ (a g-evaluation with the driver $\mu|y| + \nu|z|$ with
some $\nu, \mu > 0$) is in fact a g-evaluation with some g. The domination is understood as follows:
$\rho_{0,T}[Y + \eta] - \rho_{0,T}[Y] \le \rho^{\mu,\nu}_{0,T}[\eta]$ for all $Y, \eta \in L^2(\Omega, \mathcal{F}_T, P)$.

From now on we shall use only g-evaluations as time-consistent dynamic measures of risk. To
ensure desirable properties of the resulting measures of risk, we shall impose additional conditions
on the driver g.

Assumption 2.7. The driver g satisfies for almost all $t \in [0, T]$ the following conditions:

(i) g is deterministic and independent of y, that is, $g : [0, T] \times \mathbb{R}^d \to \mathbb{R}$, and $g(\cdot, 0) = 0$;

(ii) $g(t, \cdot)$ is convex for all $t \in [0, T]$;

(iii) $g(t, \cdot)$ is positively homogeneous for all $t \in [0, T]$.

Under these conditions, one can derive new properties of the evaluations $\rho^g_{t,T}$, $t \in [0, T]$, in
addition to the general properties of F-consistent nonlinear expectations stated in Proposition 2.3.

Theorem 2.8. Suppose g satisfies Assumption 2.4 and condition (i) of Assumption 2.7. Then the
system of g-evaluations $\rho^g_{t,r}$, $0 \le t \le r \le T$, has the following properties:

(i) Normalization: $\rho^g_{t,r}(0) = 0$;

(ii) Translation Property: for all $\xi \in L^2(\Omega, \mathcal{F}_r, P)$ and $\eta \in L^2(\Omega, \mathcal{F}_t, P)$,

$\rho^g_{t,r}(\xi + \eta) = \rho^g_{t,r}(\xi) + \eta$, a.s.

If, additionally, condition (ii) of Assumption 2.7 is satisfied, then $\rho^g_{t,r}$ has the following property:

(iii) Convexity: for all $\xi, \xi' \in L^2(\Omega, \mathcal{F}_r, P)$ and all $\lambda \in L^\infty(\Omega, \mathcal{F}_t, P)$ such that $0 \le \lambda \le 1$,

$\rho^g_{t,r}(\lambda\xi + (1 - \lambda)\xi') \le \lambda\,\rho^g_{t,r}(\xi) + (1 - \lambda)\,\rho^g_{t,r}(\xi')$, a.s.

Moreover, if g also satisfies condition (iii) of Assumption 2.7, then $\rho^g_{t,r}$ has also the following
property:

(iv) Positive Homogeneity: for all $\xi \in L^2(\Omega, \mathcal{F}_r, P)$ and all $\beta \in L^\infty(\Omega, \mathcal{F}_t, P)$ such that $\beta \ge 0$,
we have

$\rho^g_{t,r}(\beta\xi) = \beta\,\rho^g_{t,r}(\xi)$, a.s.

It follows that under Assumptions 2.4 and 2.7 the g-evaluations $\rho^g_{t,r}$ are convex or coherent
conditional measures of risk (depending on whether (iii) is assumed or not).

Finally, we can derive their dual representation, by specializing the general results of Q.

Theorem 2.9. Suppose g satisfies Assumptions 2.4 and 2.7. Then for all $0 \le t \le r \le T$ and all
$\xi \in L^2(\Omega, \mathcal{F}_r, P)$ we have

$\rho^g_{t,r}(\xi) = \sup_{\Gamma \in \mathcal{A}_{t,r}} E[\Gamma\xi \,|\, \mathcal{F}_t]$, (4)

where $\mathcal{A}_{t,r} = \partial\rho^g_{t,r}(0)$ is defined as follows:

$\mathcal{A}_{t,r} = \{ \exp( \int_t^r \gamma_s\,dW_s - \frac{1}{2}\int_t^r |\gamma_s|^2\,ds ) : \gamma \in \mathcal{H}^{2,d}[t, r],\ \gamma_s \in \partial g(s, 0),\ s \in [t, r] \}$. (5)
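
The dual representation (4)-(5) can be checked numerically in a simple case. The sketch below is only an illustration under stated assumptions: d = 1, driver g(t, z) = κ|z| (so that ∂g(s, 0) = [-κ, κ]), terminal value ξ = a·W_T, and the supremum restricted to constant processes γ. The function names `expect_gamma_xi` and `dual_value` are hypothetical. For a constant γ one has E[Γ·W_T] = γT, so the restricted supremum equals κ|a|T.

```python
import math

def expect_gamma_xi(gamma, xi, T=1.0, n=4001, width=12.0):
    """E[Gamma * xi(W_T)] with Gamma = exp(gamma*W_T - gamma^2*T/2), by the trapezoid rule."""
    s = math.sqrt(T)
    lo, hi = -width * s, width * s
    step = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        w = lo + k * step
        density = math.exp(-w * w / (2.0 * T)) / math.sqrt(2.0 * math.pi * T)
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * math.exp(gamma * w - 0.5 * gamma * gamma * T) * xi(w) * density
    return total * step

def dual_value(xi, kappa, T=1.0, grid=41):
    """Supremum of E[Gamma * xi] over constant gamma in [-kappa, kappa]."""
    gammas = [-kappa + 2.0 * kappa * k / (grid - 1) for k in range(grid)]
    return max(expect_gamma_xi(g, xi, T) for g in gammas)

kappa, a, T = 0.3, 2.0, 1.0
dual = dual_value(lambda w: a * w, kappa, T)  # should be close to kappa*|a|*T
```

Restricting to constant γ gives, in general, only a lower bound on the supremum in (4); for a linear terminal value the bound is attained at γ = κ·sign(a).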


Corollary 2.10. A constant C exists, such that for all 0 <t < r <T and all T^j. G At^r we have 


rt,r — 1 || < 


r — t 


,CT 


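Proof. It follows from the definition of $\mathcal{A}_{t,r}$ that $\Gamma_{t,r}$ is the solution of the SDE

$d\Gamma_{t,s} = \gamma_s\,\Gamma_{t,s}\,dW_s, \quad \gamma_s \in \partial g(s, 0), \quad s \in [t, r], \quad \Gamma_{t,t} = 1.$

Using the Itô isometry, we obtain the chain of relations

$\|\Gamma_{t,r} - 1\|^2 = \int_t^r \|\gamma_s \Gamma_{t,s}\|^2\,ds \le \int_t^r |\gamma_s|^2\,\|\Gamma_{t,s}\|^2\,ds \le \int_t^r |\gamma_s|^2\,(1 + \|\Gamma_{t,s} - 1\|^2)\,ds.$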

If is a uniform upper bound on the normofthe subgradients of ^( 5 , 0 ) we deduee that ||ri 
As, s [t,r], where A satisfies the ODE: -^Z\s = n(l + Z\s), with At = 0. Consequently, 


The convexity of the exponential function yields the postulated bound. 
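
The comparison step can be written out explicitly. The following worked computation assumes that the ODE in the proof reads $\frac{d}{ds}\Delta_s = \kappa(1 + \Delta_s)$ with $\Delta_t = 0$, where $\kappa$ and $\Delta$ are the symbols as read from the damaged source text; the elementary inequality $e^x - 1 \le x e^x$ for $x \ge 0$ is the convexity property referred to above.

```latex
\frac{d}{ds}\Delta_s = \kappa\,(1 + \Delta_s), \quad \Delta_t = 0
\quad\Longrightarrow\quad
\Delta_s = e^{\kappa (s - t)} - 1, \qquad s \in [t, r],
\\[4pt]
e^{\kappa (r - t)} - 1 \;\le\; \kappa\,(r - t)\, e^{\kappa (r - t)} \;\le\; \kappa\,(r - t)\, e^{\kappa T}.
```

Under this reading, the quantity dominated by $\Delta_r$ is bounded by a constant multiple of $(r - t)\,e^{\kappa T}$.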


□ 


3 The Risk-Averse Control Problem 


3.1 Problem Formulation 


Our objective is to evaluate and optimize the risk of the cumulative cost generated by a diffusion
process.

On the filtered probability space $(\Omega, \mathcal{F}, P, \mathbb{F})$, we consider control processes $u : [0, T] \times \Omega \to U$
such that $u(\cdot)$ is $\mathbb{F}$-adapted, where $U$ is a compact set, and a diffusion process under any such
control with initial time $t \in [0, T]$ and state $x \in \mathbb{R}^n$:

$dX_s^{t,x;u} = b(s, X_s^{t,x;u}, u_s)\,ds + \sigma(s, X_s^{t,x;u}, u_s)\,dW_s, \quad s \in [t, T],$
$X_t^{t,x;u} = x.$ (6)

Here, $b : [0, T] \times \mathbb{R}^n \times U \to \mathbb{R}^n$ and $\sigma : [0, T] \times \mathbb{R}^n \times U \to \mathbb{R}^{n \times d}$ are Borel measurable functions.
We also introduce the cost rate function: a measurable map $c : [0, T] \times \mathbb{R}^n \times U \to \mathbb{R}$, and the final
state cost: a measurable function $\Phi : \mathbb{R}^n \to \mathbb{R}$. Therefore, the random cost accumulated on the
interval $[t, T]$ for any $t \in [0, T]$ can be expressed as follows:

$\xi_{t,T}(u, x) := \int_t^T c(s, X_s^{t,x;u}, u_s)\,ds + \Phi(X_T^{t,x;u})$, a.s. (7)

Assumption 3.1. A constant $K > 0$ exists such that, for any $s \in [t, T]$ and $(x_1, u_1), (x_2, u_2) \in \mathbb{R}^n \times U$,
the functions b, σ, c, and Φ satisfy the following conditions:

$|b(s, x_1, u_1) - b(s, x_2, u_2)| + |\sigma(s, x_1, u_1) - \sigma(s, x_2, u_2)| + |c(s, x_1, u_1) - c(s, x_2, u_2)|$
$\le K(|x_1 - x_2| + |u_1 - u_2|),$

$|b(s, x_1, u_1)| + |\sigma(s, x_1, u_1)| + |c(s, x_1, u_1)| + |\Phi(x_1)| \le K(1 + |x_1| + |u_1|).$

Under Assumption 3.1 the controlled diffusion process (6) has a strong solution and the cost
functional is square integrable.

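We define the control value function as follows:

$V^u(t, x) := \rho^g_{t,T}[\xi_{t,T}(u, x)]$, a.s., (8)

where $\{\rho^g_{t,T}\}_{t \in [0,T]}$ is a system of g-evaluations discussed in Section 2.2. Using Definition 2.5, we
can express the control value function as follows:

$V^u(t, x) = \xi_{t,T}(u, x) + \int_t^T g(s, Z_s^{t,x;u})\,ds - \int_t^T Z_s^{t,x;u}\,dW_s$
$\quad = \Phi(X_T^{t,x;u}) + \int_t^T [c(s, X_s^{t,x;u}, u_s) + g(s, Z_s^{t,x;u})]\,ds - \int_t^T Z_s^{t,x;u}\,dW_s,$

where $(Y^{t,x;u}, Z^{t,x;u})$ solve the following BSDE:

$-dY_s^{t,x;u} = [c(s, X_s^{t,x;u}, u_s) + g(s, Z_s^{t,x;u})]\,ds - Z_s^{t,x;u}\,dW_s, \quad s \in [t, T],$
$Y_T^{t,x;u} = \Phi(X_T^{t,x;u}).$ (9)

Equivalently, $V^u(t, x) = Y_t^{t,x;u}$.

If Assumptions 3.1, 2.4, and 2.7 are satisfied, then for every $(t, x) \in [0, T] \times \mathbb{R}^n$ the BSDE (9)
has a unique solution $(Y^{t,x;u}, Z^{t,x;u}) \in \mathcal{S}^2[t, T] \times \mathcal{H}^{2,d}[t, T]$ (see Peng [33]), and, therefore, the
control value function is well-defined.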

In this way, the study of a risk-averse controlled system has been reduced to the study of
controlled forward-backward stochastic differential equations (FBSDE). Such systems were extensively
studied by Ma and Yong in [26]; other important references are lUl |3l] HU SSI. In our case, the
FBSDE is decoupled, that is, the solution of the backward equation does not affect the forward
equation, which substantially simplifies the analysis and allows for further advances.

Notice that when the driver $g \equiv 0$, the control value function (8) reduces to the expected value
of (7). Risk aversion is incorporated if other drivers satisfying Assumption 2.7 are
considered. By the comparison theorem of Peng [30], if $g_1$ is dominated by $g_2$, i.e., $g_1 \le g_2$, then
$\rho^{g_1}_{t,T}(\xi_{t,T}(u, x)) \le \rho^{g_2}_{t,T}(\xi_{t,T}(u, x))$ almost surely; the larger the driver, the more risk aversion in the
objective functional. For example, if we use $g_1(t, z) = \kappa|z|$ and $g_2(t, z) = \kappa|z_+|$, with $\kappa > 0$, then
$g_1$ dominates $g_2$.

3.2 Risk-Averse Dynamic Programming and Hamilton-Jacobi-Bellman Equations


We now proceed to the control problem. We define the admissible control system as in Yong and
Zhou (ETl p. 177]).

Definition 3.2. $\mathcal{U}[t, T]$ is called an admissible control system if it satisfies the following conditions:

(i) $(\Omega, \mathcal{F}, P)$ is a complete probability space;

(ii) $\{W(s)\}_{s \ge t}$ is a d-dimensional standard Brownian motion defined on $(\Omega, \mathcal{F}, P)$ over $[t, T]$
and $\mathbb{F}^t = (\mathcal{F}^t_s)_{s \in [t,T]}$, where $\mathcal{F}^t_s = \sigma(\{W_r;\ t \le r \le s\} \cup \mathcal{N})$ and $\mathcal{N}$ is the collection of all
P-null sets in $\mathcal{F}$;

(iii) $u : [t, T] \times \Omega \to U$ is an $\{\mathcal{F}^t_s\}_{s \ge t}$-adapted process with $E\int_t^T |u_s|^2\,ds < +\infty$;

(iv) For any $x \in \mathbb{R}^n$ the system (6)-(9) admits a unique solution $(X, Y, Z)$ on $(\Omega, \mathcal{F}, P, \mathbb{F}^t)$.

The optimal value function $V : [0, T] \times \mathbb{R}^n \to \mathbb{R}$ is defined as follows:

$V(t, x) = \inf_{u \in \mathcal{U}[t,T]} V^u(t, x).$ (10)

The weak formulation of a risk-averse control problem is the following: given $(t, x) \in [0, T) \times \mathbb{R}^n$,
find $u^* \in \mathcal{U}[t, T]$ such that

$V^{u^*}(t, x) = \inf_{u \in \mathcal{U}[t,T]} V^u(t, x).$ (11)

We can now formulate the dynamic programming equation for our control problem.


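Theorem 3.3 (Bill. Suppose Assumptions 3.1, 2.4, and 2.7 are satisfied. Then, for any $(t, x) \in [0, T) \times \mathbb{R}^n$
and all $r \in [t, T]$, we have

$V(t, x) = \inf_{u(\cdot) \in \mathcal{U}[t,T]} \rho^g_{t,r}\Big[ \int_t^r c(s, X_s^{t,x;u}, u_s)\,ds + V(r, X_r^{t,x;u}) \Big].$ (12)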


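For $\alpha \in U$ we define the Laplacian operator $\mathbb{L}^\alpha$ as follows: for $w \in C_b^{1,2}([0, T] \times \mathbb{R}^n)$ and
$(t, x) \in [0, T] \times \mathbb{R}^n$,

$[\mathbb{L}^\alpha w](t, x) = \partial_t w(t, x) + \frac{1}{2}\sum_{i,j=1}^n (\sigma\sigma^\top)_{ij}(t, x, \alpha)\,\partial_{x_i x_j} w(t, x) + \sum_{i=1}^n b_i(t, x, \alpha)\,\partial_{x_i} w(t, x).$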


On the space $C_b^{1,2}([0, T] \times \mathbb{R}^n)$, we consider the following equation

$\min_{\alpha \in U} \{ c(t, x, \alpha) + [\mathbb{L}^\alpha v](t, x) + g(t, [D_x v \cdot \sigma^\alpha](t, x)) \} = 0, \quad (t, x) \in [0, T) \times \mathbb{R}^n,$ (13)

with the boundary condition

$v(T, x) = \Phi(x), \quad x \in \mathbb{R}^n.$ (14)

We call (13)-(14) the risk-averse Hamilton-Jacobi-Bellman equation associated with the controlled
system (6) and the risk functional (8). It is a generalization of the classical Hamilton-Jacobi-Bellman
equation with the extra term $g(\cdot, \cdot)$ responsible for risk aversion. In the special case, when
$g \equiv 0$, we obtain the standard equation.

The following two theorems can be derived from general results on fully coupled
forward-backward systems in GSlI . For decoupled systems, a direct proof is provided in [43].




Theorem 3.4. Suppose Assumptions 3.1, 2.4, and 2.7 are satisfied; in addition, the functions b and
σ are bounded in x. Then the value function V is a viscosity solution of the equation (13)-(14).

It is clear that if $V \in C_b^{1,2}([0, T] \times \mathbb{R}^n)$ then it satisfies (13)-(14). We can also prove the converse
relation (verification theorem).

Theorem 3.5. Suppose the assumptions of Theorem 3.4 are fulfilled and let the function
$K \in C_b^{1,2}([0, T] \times \mathbb{R}^n)$ satisfy (13)-(14). Then $K(t, x) \le V^u(t, x)$ for any control $u(\cdot) \in \mathcal{U}$ and all
$(t, x) \in [0, T] \times \mathbb{R}^n$. Furthermore, if a control process $u^*(\cdot) \in \mathcal{U}$ exists, satisfying for almost all
$(s, \omega) \in [0, T] \times \Omega$, together with the corresponding trajectory $X_s^{t,x;u^*}$, the relation

$u^*_s \in \arg\min_{\alpha \in U} \{ c(s, X_s^{t,x;u^*}, \alpha) + [\mathbb{L}^\alpha K](s, X_s^{t,x;u^*}) + g(s, [D_x K \cdot \sigma^\alpha](s, X_s^{t,x;u^*})) \},$ (15)

then $K(t, x) = V(t, x) = V^{u^*}(t, x)$ for all $(t, x) \in [0, T] \times \mathbb{R}^n$.

4 Piecewise-Constant Control Policies and the Perturbed Problem 

Let $h^2 \in (0, 1]$ be a time discretization step. We use the square to simplify further analysis.

Definition 4.1. For any $h \in (0, 1]$ and $t \in [0, T)$, let $\mathcal{U}^t_h$ be the subset of $\mathcal{U}$ consisting of all
F-adapted processes $u$ which are constant on intervals $[t, t + h^2)$, $[t + h^2, t + 2h^2)$, ..., $[t + kh^2, T]$,
where $T - h^2 \le t + kh^2 < T$.
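
As an illustration of Definition 4.1, the following Python sketch builds the switching grid t, t + h², ..., t + kh² and a deterministic piecewise-constant control on it. The helper names `control_grid` and `piecewise_constant` are hypothetical, and the controls in the paper are adapted processes; here a single deterministic path is shown.

```python
import math

def control_grid(t, T, h):
    """Return the switching times t, t+h^2, ..., t+k*h^2 with T - h^2 <= t + k*h^2 < T.

    Floating-point rounding may matter when (T - t) / h^2 is very close to an integer.
    """
    step = h * h
    k = math.ceil((T - t) / step) - 1
    return [t + i * step for i in range(k + 1)]

def piecewise_constant(grid, values):
    """Return u(s) equal to values[i] on [grid[i], grid[i+1]) and values[-1] from grid[-1] on.

    Calling u(s) with s < grid[0] raises ValueError (no interval contains s).
    """
    def u(s):
        i = max(j for j, tj in enumerate(grid) if tj <= s)
        return values[i]
    return u

grid = control_grid(t=0.0, T=1.0, h=0.5)   # step h^2 = 0.25
u = piecewise_constant(grid, [1.0, -1.0, 0.5, 0.0])
```

The last interval is closed at T, in line with the definition, so the final control value also applies at the terminal time.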

We define the corresponding value function $V_h : [0, T] \times \mathbb{R}^n \to \mathbb{R}$ as follows:

$V_h(t, x) = \inf_{u(\cdot) \in \mathcal{U}^t_h} V^u(t, x).$ (16)

We assume a stronger condition than Assumptions 2.4 and 3.1.

Assumption 4.2. Let $\mu(t, x, z, \alpha)$ stand for $\sigma(t, x, \alpha)$, $b(t, x, \alpha)$, $c(t, x, \alpha)$, and $\Phi(x)$.² We assume
that a constant K exists such that

(i) For all $\alpha, \alpha_1, \alpha_2 \in U$, $x, x_1, x_2 \in \mathbb{R}^n$, $z, z_1, z_2 \in \mathbb{R}^d$ we have

$|\mu(t, x, z, \alpha)| \le K,$

$|\mu(t, x_1, z_1, \alpha_1) - \mu(t, x_2, z_2, \alpha_2)| \le K(|x_1 - x_2| + |z_1 - z_2| + |\alpha_1 - \alpha_2|);$

(ii) For all $\alpha \in U$, $s, t \in [0, T]$, $x \in \mathbb{R}^n$, $z \in \mathbb{R}^d$, we have

\p{t, X, z, a) — p{s, X, z,a)\ < K\t — . 

²We sometimes write $\mu^\alpha$ instead of $\mu(\alpha)$.

By general results on forward-backward systems, the system (6), (7) and (9) has a unique
solution and thus both functions, V in (10) and $V_h$ in (16), are well-defined. In particular, they are both
deterministic. We focus on the difference between the value functions V and $V_h$.

The idea is to embed the original control problem into a family of time and space perturbed
problems, and then obtain a smooth approximation of the value function by means of an integral
regularization (mollification).

Let $B = \{(\tau, \zeta) \in \mathbb{R} \times \mathbb{R}^n : \tau \in (-1, 0),\ |\zeta| < 1\}$. Consider a time $t \in [0, T]$ and time
instants $t_i = t + ih^2$, $i = 0, 1, \dots, k$, and $t_{k+1} = T$. For a piecewise-constant control $u_s = \alpha_i$,
$s \in [t_i, t_{i+1})$, $i = 0, 1, \dots, k$, and perturbations $\beta_i = (\tau_i, \zeta_i) \in B$, $i = 0, 1, \dots, k$, we define the perturbed
controlled FBSDE system:

$dX_s = b(s + \varepsilon^2\tau_i, X_s + \varepsilon\zeta_i, \alpha_i)\,ds + \sigma(s + \varepsilon^2\tau_i, X_s + \varepsilon\zeta_i, \alpha_i)\,dW_s,$ (17)

$-dY_s = [c(s + \varepsilon^2\tau_i, X_s + \varepsilon\zeta_i, \alpha_i) + g(s + \varepsilon^2\tau_i, Z_s)]\,ds - Z_s\,dW_s,$ (18)

$s \in [t_i, t_{i+1}), \quad i = 0, 1, \dots, k,$

with a fixed $\varepsilon > 0$, with the initial condition $X_t = x$, and with the final condition $Y_T = \Phi(X_T)$.
The process W is a Brownian motion. We assume here that $b(t, x, \alpha) = b(0, x, \alpha)$, $\sigma(t, x, \alpha) = \sigma(0, x, \alpha)$,
$c(t, x, \alpha) = c(0, x, \alpha)$, and $g(t, z) = g(0, z)$ for all $t \in [-\varepsilon^2, 0]$.

We consider the following discrete-time optimal control problem associated with the system
(17)-(18). At each time $t_i$, we select a control value $\alpha_i$ and a perturbation $\beta_i \in B$. The system
evolves to time $t_{i+1}$, when new controls $\alpha_{i+1}$ and $\beta_{i+1}$ are selected. The objective of the controller
is to make $Y_t$ the smallest possible. From now on, we use α and β to represent the random sequences
$\alpha_i$ and $\beta_i$, for $i = 0, 1, \dots, k$.

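Lemma 4.3. Functions $V^{\alpha,\beta}_{h,\varepsilon} : \{t_0, t_1, \dots, t_k\} \times \mathbb{R}^n \to \mathbb{R}$ exist, such that for all $x_i \in \mathbb{R}^n$, if the
system (17)-(18) starts at time $t_i$ from $X_{t_i} = x_i$, then $Y_{t_i} = V^{\alpha,\beta}_{h,\varepsilon}(t_i, x_i)$. Moreover,

$V^{\alpha,\beta}_{h,\varepsilon}(t_i, x_i) = \rho^g_{t_i + \varepsilon^2\tau_i,\ t_{i+1} + \varepsilon^2\tau_i}\Big[ \int_{t_i + \varepsilon^2\tau_i}^{t_{i+1} + \varepsilon^2\tau_i} c(s, X_s^{t_i + \varepsilon^2\tau_i,\, x_i + \varepsilon\zeta_i;\, \alpha_i}, \alpha_i)\,ds + V^{\alpha,\beta}_{h,\varepsilon}(t_{i+1}, X_{t_{i+1} + \varepsilon^2\tau_i}^{t_i + \varepsilon^2\tau_i,\, x_i + \varepsilon\zeta_i;\, \alpha_i} - \varepsilon\zeta_i) \Big].$ (19)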


Proof. Wifh Wg ~ Wgj^g. 2 .,. for s G [U, fj+i] direcfly from fhe equations ([TtI) and ® we obfain: 

se[F,ti+f, a.s. (20) 

Wifh fhis subsfifufion, fhe BSDE (fT^ at s = F is equivalenf fo: 


Zti — Yti+i + 


rti+-i+£^Ti 


I ti+£^Ti 




r-q+i+e^Ti 


ZgdWg 


I ti+£^Ti 


= 

^ti+£^Ti,tiJ,.l+£^Ti 


fti+l+£^Ti 0 , ~ 

/ c(s, n,Xi+£(:i;ai ^ 

J ti+e^Ti 
By definition, $Y_T = \Phi(X_T)$. Supposing $Y_{t_{j+1}} = V^{\alpha,\beta}_{h,\varepsilon}(t_{j+1}, X_{t_{j+1}})$ for some j, proceeding
backwards in time we conclude from the last equation that we can write $Y_{t_j} = V^{\alpha,\beta}_{h,\varepsilon}(t_j, X_{t_j})$ for some
function $V^{\alpha,\beta}_{h,\varepsilon}(t_j, \cdot)$. We can thus write the recursive relation


= p\ 




J U+e^Ti 


Substitution of (1201) proves ([T9l ). 


□ 


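Using this recursive relation, we define the value function of the optimally perturbed
problem $V_{h,\varepsilon}(t_i, x_i)$ at each time $t_i$ and the corresponding state $x_i$ as follows. At $t_{k+1} = T$ we set
$V_{h,\varepsilon}(T, x_T) = \Phi(x_T)$, and then, proceeding backwards in time,

$V_{h,\varepsilon}(t_i, x_i) = \inf_{\alpha_i \in U}\ \inf_{\beta_i \in B}\ \rho^g_{t_i + \varepsilon^2\tau_i,\ t_{i+1} + \varepsilon^2\tau_i}\Big[ \int_{t_i + \varepsilon^2\tau_i}^{t_{i+1} + \varepsilon^2\tau_i} c(s, X_s^{t_i + \varepsilon^2\tau_i,\, x_i + \varepsilon\zeta_i;\, \alpha_i}, \alpha_i)\,ds + V_{h,\varepsilon}(t_{i+1}, X_{t_{i+1} + \varepsilon^2\tau_i}^{t_i + \varepsilon^2\tau_i,\, x_i + \varepsilon\zeta_i;\, \alpha_i} - \varepsilon\zeta_i) \Big].$

This construction can be carried out for every $t \in [0, T]$ and the resulting points $t_i = t + ih^2$, thus
defining a function $V_{h,\varepsilon} : [0, T] \times \mathbb{R}^n \to \mathbb{R}$, which satisfies the relation

$V_{h,\varepsilon}(t, x) = \inf_{\alpha \in U}\ \inf_{(\tau,\zeta) \in B}\ \rho^g_{t + \varepsilon^2\tau,\ t + h^2 + \varepsilon^2\tau}\Big[ \int_{t + \varepsilon^2\tau}^{t + h^2 + \varepsilon^2\tau} c(s, X_s^{t + \varepsilon^2\tau,\, x + \varepsilon\zeta;\, \alpha}, \alpha)\,ds + V_{h,\varepsilon}(t + h^2, X_{t + h^2 + \varepsilon^2\tau}^{t + \varepsilon^2\tau,\, x + \varepsilon\zeta;\, \alpha} - \varepsilon\zeta) \Big].$ (21)

If $t \in (T - h^2, T)$ we replace $t + h^2$ with T in the above equation. The function $V_{h,\varepsilon}(t, x)$ represents
the optimal value of the perturbed problem starting at time t from the state x and proceeding with
piecewise constant controls and perturbations on intervals of length $h^2$ (except, perhaps, the last
one, which ends at T). Let us stress that the perturbations are treated as additional controls in this
construction.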

We now present a number of useful estimates from Krylov 12^ . 

Lemma 4.4. For t E [0, T), x, y E M”, and a E denote by the solution of (1171) and by 

the solution of (O, with the initial state x E M” at time t. Then 


E 

sup 

Yt,x;a,P 

\rs,x]a 12 
“ \ 

< iVe^^e^ 


(22) 


L t<s<T 






E 

sup 

Xt,x-,af 

\ 

< iVe^^lx 

CN 

1 

(23) 


L t<s<T 






E 

sup 

■^s 

X^,y,af\‘^ 

\ 

< Ne^^\t 

- 

(24) 


L t<r<T 







for $N > 0$ depending on $(K, d, n)$ only.

The proof uses Theorem 2.5.9 in E^ . These estimates can be used to derive the following
bounds.

Lemma 4.5. A constant N exists, depending on $(K, d, n)$ only, such that:

(i) For $t \in [0, T]$ and $x \in \mathbb{R}^n$ we have $|V_{h,\varepsilon}(t, x) - V_h(t, x)| \le Ne^{NT}\varepsilon$;

(ii) For $t, r \in [0, T]$, and $x, y \in \mathbb{R}^n$, we have $|V_{h,\varepsilon}(t, x) - V_{h,\varepsilon}(r, y)| \le Ne^{NT}(|x - y| + |t - r|^{1/2})$.

Proof. For fixed a, recall that 


= pIj. 


c{r,Xp^'^^,ar)dr + <F{xtf'^) 


^h,f = Pt,T c(r, ar) dr + <F{x!f'’^’^) 

By standard estimates for BSDE and Lemma l4Al we have with some 7 > 0 depending on (iT, d, n). 


\V^{t,x) - v^f{t,x)\ < E 

rT 


+ E 


Jt 


Ic(r, ,ar)- c(r, , a,) |dr 


< x2g7(T-t)Er IxyP - sup 1^ < Ne^'^e. 


sup 

t<T 


The first assertion follows. For (ii), we observe that 


- r"f (r.s)! < Kf(t,x) - V«f(t,y)\ + \V«fit,y) - V«f(r,y)\. 

Similar to the proof for (i), by applying the second and third inequalities of Lemma |4Aj we have 

\K,f (*> - K,f (^> y)| ^ Ne^^\x - yp, 

\^hf (^,y) - ^h,f iFy)\ < Xe^^|f-r|5, 

which implies the postulated estimates. □ 


5 Mollification of the Value Function 

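We now introduce an integral transformation of the value function. We take a non-negative function
$\varphi \in C_0^\infty(B)$ with $\int_B \varphi(\tau, \zeta)\,d\tau\,d\zeta = 1$, called a mollifier. For $\varepsilon > 0$, we re-scale the mollifier as
$\varphi_\varepsilon(\tau, \zeta) = \varepsilon^{-n-2}\varphi(\tau/\varepsilon^2, \zeta/\varepsilon)$, and we introduce the following notation of the convolution of the
function $V_{h,\varepsilon}$ with the re-scaled mollifier:

$\bar{V}_{h,\varepsilon}(t, x) = [V_{h,\varepsilon} \star \varphi_\varepsilon](t, x) = \int_B V_{h,\varepsilon}(t - \varepsilon^2\tau, x - \varepsilon\zeta)\,\varphi(\tau, \zeta)\,d\tau\,d\zeta,$

where $t \in [0, T - \varepsilon^2]$ and $x \in \mathbb{R}^n$. We shall need an estimate of the seminorm $\|\bar{V}_{h,\varepsilon}\|_{2,1}$ defined as
follows: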

$\|w\|_{2,1} = \sup_{(t,x)} |w(t, x)| + \sup_{(t,x)} \|D_x w(t, x)\| + \sup_{(t,x)} \|D^2_{xx} w(t, x)\| + \sup_{(t,x)} |\partial_t w(t, x)|$
$\quad + \sup_{(t,x),(s,y)} \dfrac{\|D^2_{xx} w(t, x) - D^2_{xx} w(s, y)\|}{|t - s|^{1/2} + |x - y|} + \sup_{(t,x),(s,y)} \dfrac{|\partial_t w(t, x) - \partial_t w(s, y)|}{|t - s|^{1/2} + |x - y|}.$

In the formula above, we use $D_x$ and $D^2_{xx}$ to denote the gradient and the Hessian matrix, and the
supremum is always over $(t, x), (s, y) \in [0, T - \varepsilon^2] \times \mathbb{R}^n$.
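
The effect of convolving with a rescaled mollifier can be illustrated in a simplified setting. The Python sketch below mollifies a function of the space variable only (one dimension, no time shift), which is a deliberate simplification of the joint time-space construction in the text; the bump function and the names `bump` and `mollify` are illustrative assumptions. For a function with Lipschitz constant L, the mollified function differs from the original by at most L·ε, which is the mechanism behind the sup-norm estimate of the difference between the value function and its mollification.

```python
import math

def bump(z):
    """Unnormalized smooth bump function supported on (-1, 1)."""
    return math.exp(-1.0 / (1.0 - z * z)) if abs(z) < 1.0 else 0.0

def mollify(f, x, eps, n=2001):
    """Approximate the integral of f(x - eps*z) * phi(z) over (-1, 1), with phi = bump / mass."""
    step = 2.0 / (n - 1)
    nodes = [-1.0 + k * step for k in range(n)]
    weights = [bump(z) for z in nodes]
    mass = sum(weights)  # the common quadrature factor `step` cancels in the ratio
    return sum(w * f(x - eps * z) for w, z in zip(weights, nodes)) / mass

eps = 0.1
# f(x) = |x| has Lipschitz constant 1, so the gap should not exceed eps.
gap = max(abs(mollify(abs, x, eps) - abs(x)) for x in [-1.0, -0.05, 0.0, 0.03, 0.5])
```

Away from the kink at zero the function is linear on the averaging window and the symmetric bump leaves it unchanged; near the kink the gap is positive but stays below ε.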


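Lemma 5.1. Let $\bar{V}_{h,\varepsilon} = V_{h,\varepsilon} \star \varphi_\varepsilon$ denote the mollified value function. If $\varepsilon \ge h$, then

$\|\bar{V}_{h,\varepsilon}\|_{2,1} \le Ne^{NT}\varepsilon^{-2}$ and $\sup_{(t,x)} |\bar{V}_{h,\varepsilon}(t, x) - V_{h,\varepsilon}(t, x)| \le Ne^{NT}\varepsilon.$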


Proof. By elementary properties of the convolution. 


d 


^) = ^ ( ^h,e * Pe ) (L x) = ( 14 ,^ ) (f, x) 


= e 


dt 

-2 


[ Vh,e{t 

Jb 


d 

- eV, X - £C,)—ip{T, C) dr dC 


Thus, due to Lemma l43l iii. 




= e 


-2 


d 

- £^T,x - £C,)—ip{T,C,) dr dC, 


= e 


-2 


/ Vh,e{t 
JB 
r 

[Vh,eit-£‘^T,x-£C)-Vh,s{t,x)]—y?{T,C)dTdC 


< 2Ne^'^£-^ 


IB 


Pix, C) 


dr d^. 


We can thus increase N to write for all e > 0 the inequality 




< Ne^^e-^. 


Similarly, after redefining N in an appropriate way, 
d 


dx 


■Vh, 




d‘^ 


-Vh, 


^^2 


+ 


d3 




y 

dtdx^ 

+ 

0 


dx^dxldt^^’^ 

d3 


dx^dxl ' 




dx^dxldx^ 


Vh, 


It follows that 
d 


d 


< Ne^^£-^. 


52 


^Vh,e{t,x) - ^14,£(s,y) <\t- s||^i4,£ ^ + iV|x - y\ 

„NTu „U-3 , l^J„NT^ 


92 


dx^dt 
-2 


Vh,e 


<Ne^^^\t-s\£-^+ Ne^^^\x-y\e 
= Ne^^e-^{\t - s|e-^ + |x - y\). 
The last expression is less than $Ne^{NT}\varepsilon^{-2}(|t - s|^{1/2} + |x - y|)$ if $|t - s| \le \varepsilon^2$. On the other hand, if
$|t - s| \ge \varepsilon^2$, then


(9 ^ ^ f> / \ 

-^yh,eit,x) - -^Vh,s{s,y) 


< 2 




< Ne^^s-^ < iVe^^e-2(|t-s|i + \x-y\). 


Hence the inequality in question holds for all f, s € [0, T]. In the same way one gets that 

^ Vh,s(t,x} - gjg^j yh,s(s,y) < Ne-^\t - s\^ + \x - y\). 


dx'^dx^ 


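This proves the first inequality in the assertions.

To prove the second one, we notice that Lemma 4.5 yields, for the mollified function $\bar{V}_{h,\varepsilon} = V_{h,\varepsilon} \star \varphi_\varepsilon$,

$|\bar{V}_{h,\varepsilon}(t, x) - V_{h,\varepsilon}(t, x)| \le \int_B |V_{h,\varepsilon}(t, x) - V_{h,\varepsilon}(t - \varepsilon^2\tau, x - \varepsilon\zeta)|\,\varphi(\tau, \zeta)\,d\tau\,d\zeta \le 2Ne^{NT}\varepsilon.$

We can thus adjust N, if needed, to establish the second estimate for all $\varepsilon > 0$. □

We can now establish a dynamic programming bound for the mollified value function.

Lemma 5.2. Suppose Assumptions 2.7 and 4.2 are satisfied. Then for all $x \in \mathbb{R}^n$, $t \in [0, T - \varepsilon^2 - h^2]$,
and all $\alpha \in U$ we have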


Vh,e{t,x) 


rt+h? 

c{s, a) ds + 


+ Ne^^h^£, (25) 


where N is a constant independent of h, ε, and T.

Proof. Fixing $\beta = (\tau, \zeta)$ on the right hand side of (21), for every $\alpha \in U$ we obtain the inequality

ft+td+e^T 


yh,sit,x) < Pj_,_g2.r,t+/l2+£27 


Ut+e^T 


c{s, a) ds+Vh,e{t+h\xl+l,::%^^’^-£C) 


Since t < T — £^ — h?, we can substitute t — for t and x — £C for x. We obtain 


Vh,e{t-£\,X-£C) 


pt-\-h ^ 

c(s, a) ds + 14 ,+ h^ - e^r, - £() 


By virtue of Theorem 12.81 the risk measure pi [•] is subadditive, and thus 

rt+h^ 


Vh,e{t-£\x-£C) <plt+h^ 

+ Ppi+h^ 


c{s, a) ds + Vh^t + /i^ 

Vh,s {t + h^- eV, - £C) - Vh,e {t + h^ x;;^“) 


( 26 ) 
The last term on the right hand side of (26) can be bounded by using the dual
representation of the risk measure $\rho^g_{t,t+h^2}[\,\cdot\,]$. We can thus write the following chain of relations:


rP 


Vh,e{t + h^- - eC) - Vh4t + 


= sup Ej 






r 


Vh,e{t + h^- eV, - eC) - Vh,,{t + h\ 


= E. 


Vh,e[t + h^- e\ - eC) - Vh,e{t + X;;^“) 

(r - 1) (Vh,s{t + h^- e\ X;;^“ - eC)) - Vh,e{t + 

Vh,e{t + h^- ^l+h^ - O - 


+ sup Et 
r<^A,t+h2 


<Et 


+ sup ||r-l|| Vh,s{t + h^-e\xl’_^j;^-eC)-Vh4t + h^Xf^';;^) 


r&A. 




pit+h^ 

<Et 


Owing to Corollary 12. 101 and Lemma [57TT ii). we obtain the estimate 

X(t + - eQ - VnAt + h\x‘AP) 

k.,,(t + ft" - e"T, x\AA - e() - VuAt + ft", X‘AP) 

Substitution for the last term in (1261 ) yields 

Vh,e{t-e^T,X-£C) <plt+h^ 


+ Ne^^h^e. 


/ t+fl 

c(s, X‘’"’^ a) ds + {t + /i^ Xj;"^) 


+ Et 


+ h^- e\xXC. - eC) - Ke[t + 


ANe^^h^e. (27) 


We now multiply both sides of (27) by $\varphi(\tau, \zeta)$ and integrate over B. By changing the order of
integration in the expected value term of (27) we observe that

[ Ei \Vh,e {t + h^- e\ X;;^“ - eC) - Vh,e {t + h^, xj;"^) 1 ifir, C) dr dC 

J B 


= Eft 


Ub 


yh,e{t + h^ - e^r,x*)^^ - eC) - Vh,e{t + j (^(r, C) drdC 


= 0 . 


Other terms on the right hand side of (27) do not depend on $(\tau, \zeta)$ and thus (25) follows.


□ 


6 Accuracy of the Approximation 

We can now investigate the effect of the size of the discretization interval on the accuracy of the
value function approximation. For simplicity of presentation, we write $\sigma^\alpha(s, x)$ for $\sigma(s, x, \alpha)$ and
$c^\alpha(s, x)$ for $c(s, x, \alpha)$.

Lemma 6.1. For any $w \in C^{1,2}([0, T] \times \mathbb{R}^n)$, any $0 \le t \le \theta \le T$, and all $u(\cdot) \in \mathcal{U}$ we have:


w{t,x) = pIq 


i 


n,) ds + w{e, - c 


t,x;u\ 


where 




(28) 


(29) 


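Proof. For any $u(\cdot) \in \mathcal{U}$, we apply the Itô formula to $w(s, X^{t,x;u}_s)$:

\[
w\big(\theta, X^{t,x;u}_{\theta}\big) - w(t,x) - \int_t^{\theta} \big[\mathbb{L}^{u_s} w\big]\big(s, X^{t,x;u}_s\big)\, ds = \int_t^{\theta} [D_x w \cdot \sigma^{u_s}]\big(s, X^{t,x;u}_s\big)\, dW_s.
\]

Subtraction of $\int_t^{\theta} g\big(s, [D_x w \cdot \sigma^{u_s}](s, X^{t,x;u}_s)\big)\, ds$ from both sides and evaluation of the risk on both sides yields

\[
\begin{aligned}
&\rho_{t,\theta}\Big[ w\big(\theta, X^{t,x;u}_{\theta}\big) - w(t,x) - \int_t^{\theta} \Big( \big[\mathbb{L}^{u_s} w\big]\big(s, X^{t,x;u}_s\big) + g\big(s, [D_x w \cdot \sigma^{u_s}]\big(s, X^{t,x;u}_s\big)\big) \Big)\, ds \Big] \\
&\quad = \rho_{t,\theta}\Big[ \int_t^{\theta} [D_x w \cdot \sigma^{u_s}]\big(s, X^{t,x;u}_s\big)\, dW_s - \int_t^{\theta} g\big(s, [D_x w \cdot \sigma^{u_s}]\big(s, X^{t,x;u}_s\big)\big)\, ds \Big].
\end{aligned}
\tag{30}
\]

The risk measure on the right-hand side of (30) is the solution of the following backward stochastic differential equation:

\[
\begin{aligned}
Y^{t,x;u}_t ={}& \int_t^{\theta} [D_x w \cdot \sigma^{u_s}]\big(s, X^{t,x;u}_s\big)\, dW_s - \int_t^{\theta} g\big(s, [D_x w \cdot \sigma^{u_s}]\big(s, X^{t,x;u}_s\big)\big)\, ds \\
&+ \int_t^{\theta} g\big(s, Z^{t,x;u}_s\big)\, ds - \int_t^{\theta} Z^{t,x;u}_s\, dW_s.
\end{aligned}
\]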

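Substitution of $Z^{t,x;u}_s = [D_x w \cdot \sigma^{u_s}](s, X^{t,x;u}_s)$ yields $Y^{t,x;u}_t = 0$. By the uniqueness of the solution of the BSDE, the right-hand side of (30) is zero. Using the translation property on the left-hand side of (30), we obtain

\[
w(t,x) = \rho_{t,\theta}\Big[ -\int_t^{\theta} \Big( \big[\mathbb{L}^{u_s} w\big]\big(s, X^{t,x;u}_s\big) + g\big(s, [D_x w \cdot \sigma^{u_s}]\big(s, X^{t,x;u}_s\big)\big) \Big)\, ds + w\big(\theta, X^{t,x;u}_{\theta}\big) \Big].
\]

This is the same as (28). □

The integral in (29) can be bounded by the following lemma.

Lemma 6.2. For all $t \in [0, T-h^2-\varepsilon^2]$, $x \in \mathbb{R}^n$ and all $\alpha \in U$, we have

\[
\big[c^{\alpha} + \mathbb{L}^{\alpha} V_{h,\varepsilon}\big](t,x) + g\big(t, [D_x V_{h,\varepsilon} \cdot \sigma^{\alpha}](t,x)\big) \ge -N e^{NT}\Big( \varepsilon + \frac{h}{\varepsilon^2} \Big),
\tag{31}
\]

where the constant $N$ does not depend on $h$, $\varepsilon$, and $T$.

Proof. By Lemma 5.2, for every $\alpha \in U$ we have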

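\[
V_{h,\varepsilon}(t,x) \le \rho_{t,t+h^2}\Big[ \int_t^{t+h^2} c\big(s, X^{t,x;\alpha}_s, \alpha\big)\, ds + V_{h,\varepsilon}\big(t+h^2, X^{t,x;\alpha}_{t+h^2}\big) \Big] + N e^{NT} h^2 \varepsilon.
\]

Using the translation property of $\rho_{t,t+h^2}[\,\cdot\,]$ we obtain the inequality:

\[
\rho_{t,t+h^2}\Big[ \int_t^{t+h^2} c\big(s, X^{t,x;\alpha}_s, \alpha\big)\, ds + V_{h,\varepsilon}\big(t+h^2, X^{t,x;\alpha}_{t+h^2}\big) - V_{h,\varepsilon}(t,x) \Big] \ge -N e^{NT} h^2 \varepsilon.
\]

Since $V_{h,\varepsilon} \in C^{1,2}([t, T-\varepsilon^2]\times\mathbb{R}^n)$, we can evaluate the difference $V_{h,\varepsilon}(t+h^2, X^{t,x;\alpha}_{t+h^2}) - V_{h,\varepsilon}(t,x)$ by the Itô formula between $t$ and $t+h^2$:

\[
V_{h,\varepsilon}\big(t+h^2, X^{t,x;\alpha}_{t+h^2}\big) - V_{h,\varepsilon}(t,x) = \int_t^{t+h^2} \big[\mathbb{L}^{\alpha} V_{h,\varepsilon}\big]\big(s, X^{t,x;\alpha}_s\big)\, ds + \int_t^{t+h^2} [D_x V_{h,\varepsilon} \cdot \sigma^{\alpha}]\big(s, X^{t,x;\alpha}_s\big)\, dW_s.
\]

Substitution into the previous inequality yields:

\[
\rho_{t,t+h^2}\Big[ \int_t^{t+h^2} \big[c^{\alpha} + \mathbb{L}^{\alpha} V_{h,\varepsilon}\big]\big(s, X^{t,x;\alpha}_s\big)\, ds + \int_t^{t+h^2} [D_x V_{h,\varepsilon} \cdot \sigma^{\alpha}]\big(s, X^{t,x;\alpha}_s\big)\, dW_s \Big] \ge -N e^{NT} h^2 \varepsilon.
\tag{32}
\]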

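The evaluation of the risk measure amounts to solving the following backward stochastic differential equation:

\[
\begin{aligned}
Y_t ={}& \int_t^{t+h^2} \big[c^{\alpha} + \mathbb{L}^{\alpha} V_{h,\varepsilon}\big]\big(s, X^{t,x;\alpha}_s\big)\, ds + \int_t^{t+h^2} [D_x V_{h,\varepsilon} \cdot \sigma^{\alpha}]\big(s, X^{t,x;\alpha}_s\big)\, dW_s \\
&+ \int_t^{t+h^2} g(s, Z_s)\, ds - \int_t^{t+h^2} Z_s\, dW_s.
\end{aligned}
\]

The equation has a unique solution:

\[
\begin{aligned}
Z_s &= [D_x V_{h,\varepsilon} \cdot \sigma^{\alpha}]\big(s, X^{t,x;\alpha}_s\big), \qquad t \le s \le t+h^2, \\
Y_t &= \int_t^{t+h^2} \Big( \big[c^{\alpha} + \mathbb{L}^{\alpha} V_{h,\varepsilon}\big]\big(s, X^{t,x;\alpha}_s\big) + g\big(s, [D_x V_{h,\varepsilon} \cdot \sigma^{\alpha}]\big(s, X^{t,x;\alpha}_s\big)\big) \Big)\, ds.
\end{aligned}
\]

We can thus write the inequality

\[
\begin{aligned}
Y_t \le{}& h^2 \Big( \big[c^{\alpha} + \mathbb{L}^{\alpha} V_{h,\varepsilon}\big](t,x) + g\big(t, [D_x V_{h,\varepsilon} \cdot \sigma^{\alpha}](t,x)\big) \Big) \\
&+ h^2 \max_{t \le s \le t+h^2} \Big| \big[c^{\alpha} + \mathbb{L}^{\alpha} V_{h,\varepsilon}\big]\big(s, X^{t,x;\alpha}_s\big) - \big[c^{\alpha} + \mathbb{L}^{\alpha} V_{h,\varepsilon}\big](t,x) \Big| \\
&+ h^2 \max_{t \le s \le t+h^2} \Big| g\big(s, [D_x V_{h,\varepsilon} \cdot \sigma^{\alpha}]\big(s, X^{t,x;\alpha}_s\big)\big) - g\big(t, [D_x V_{h,\varepsilon} \cdot \sigma^{\alpha}](t,x)\big) \Big|.
\end{aligned}
\]

The last two terms can be bounded by $N e^{NT} h^3/\varepsilon^2$, owing to Assumption 4.2 and Lemma 5.1. Combining this inequality with (32) and dividing by $h^2$, we conclude that for all $\alpha \in U$ the estimate (31) is true. □

We are now ready to prove the main theorem of this section.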

Theorem 6.3. Suppose Assumptions 2.7 and 4.2 are satisfied. Then for any $t \in [0,T]$, $x \in \mathbb{R}^n$ and $h \in (0,1]$, we have

\[
|V(t,x) - V_h(t,x)| \le N e^{NT} h^{1/3},
\]

where the constant $N$ depends only on $(K, n, d)$.

Proof. We set $\varepsilon = h^{1/3}$ and organize the proof in three steps.

Step 1: If $t \in [T-h^2-\varepsilon^2, T]$, then for any $u(\cdot)$ and some constant $C$ we have

\[
\begin{aligned}
|V^{u}(t,x) - \Phi(x)| &\le \rho_{t,T}\Big[ \int_t^{T} \big|c\big(s, X^{t,x;u}_s, u_s\big)\big|\, ds + \big|\Phi\big(X^{t,x;u}_T\big) - \Phi(x)\big| \Big] \\
&\le K(h^2 + \varepsilon^2) + K \rho_{t,T}\Big[ \big|X^{t,x;u}_T - x\big| \Big] \\
&\le K(1+C)(h + \varepsilon) \le 2K(1+C) h^{1/3}.
\end{aligned}
\]

In the above estimate we also used the fact that the solution of the forward-backward system is Lipschitz in the initial condition [26]. The same reasoning works for $V_h^{u}$, and thus

\[
|V_h^{u}(t,x) - \Phi(x)| \le 2K(1+C) h^{1/3}.
\]

We can, therefore, for some constant $N$ write the inequality

\[
|V^{u}(t,x) - V_h^{u}(t,x)| \le N e^{NT} h^{1/3}.
\]

Optimizing over $u$ does not worsen this bound, and thus our assertion is true for these $t$.

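Step 2: Consider $t \in [0, T-h^2-\varepsilon^2]$. By Lemma 6.1, for all $u(\cdot)$ on $[t, T-h^2-\varepsilon^2]$, we have

\[
V_{h,\varepsilon}(t,x) \le \rho_{t,T-h^2-\varepsilon^2}\Big[ \int_t^{T-h^2-\varepsilon^2} c\big(s, X^{t,x;u}_s, u_s\big)\, ds + V_{h,\varepsilon}\big(T-h^2-\varepsilon^2, X^{t,x;u}_{T-h^2-\varepsilon^2}\big) - C \Big],
\]

where, owing to Lemma 6.2,

\[
C = \int_t^{T-h^2-\varepsilon^2} \Big( \big[c^{u_s} + \mathbb{L}^{u_s} V_{h,\varepsilon}\big]\big(s, X^{t,x;u}_s\big) + g\big(s, [D_x V_{h,\varepsilon} \cdot \sigma^{u_s}]\big(s, X^{t,x;u}_s\big)\big) \Big)\, ds \ge -N T e^{NT}\Big( \varepsilon + \frac{h}{\varepsilon^2} \Big).
\]

These relations, the monotonicity of the risk measure, and Lemmas 4.5 and 5.1 imply the estimate

\[
V_h(t,x) \le \rho_{t,T-h^2-\varepsilon^2}\Big[ \int_t^{T-h^2-\varepsilon^2} c\big(s, X^{t,x;u}_s, u_s\big)\, ds + V_h\big(T-h^2-\varepsilon^2, X^{t,x;u}_{T-h^2-\varepsilon^2}\big) \Big] + N T e^{NT}\Big( \varepsilon + \frac{h}{\varepsilon^2} \Big).
\]

In view of the inequality established in Step 1, using $\varepsilon = h^{1/3}$, and redefining $N$ appropriately, we obtain the following inequality:

\[
V_h(t,x) \le \rho_{t,T-h^2-\varepsilon^2}\Big[ \int_t^{T-h^2-\varepsilon^2} c\big(s, X^{t,x;u}_s, u_s\big)\, ds + V\big(T-h^2-\varepsilon^2, X^{t,x;u}_{T-h^2-\varepsilon^2}\big) \Big] + N e^{NT} h^{1/3}.
\tag{33}
\]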

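Step 3: We apply the dynamic programming equation to the right-hand side of (33) to conclude that

\[
\begin{aligned}
V_h(t,x) &\le \inf_{u(\cdot) \in \mathcal{U}} \rho_{t,T-h^2-\varepsilon^2}\Big[ \int_t^{T-h^2-\varepsilon^2} c\big(s, X^{t,x;u}_s, u_s\big)\, ds + V\big(T-h^2-\varepsilon^2, X^{t,x;u}_{T-h^2-\varepsilon^2}\big) \Big] + N e^{NT} h^{1/3} \\
&= V(t,x) + N e^{NT} h^{1/3},
\end{aligned}
\]

as required.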


□ 
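The choice $\varepsilon = h^{1/3}$ in the proof can be motivated by a short balancing computation. The sketch below rests on an assumption: that, with constants dropped, the error has the shape $\varepsilon + h/\varepsilon^2$, as the argument for Lemma 6.2 suggests (a mollification error of order $\varepsilon$ plus a discretization error of order $h/\varepsilon^2$). The function names are hypothetical. It minimizes this expression over $\varepsilon$ and fits the resulting rate in $h$.

```python
import math

def error_bound(h, eps):
    """Assumed shape of the total error, constants dropped: eps + h / eps**2."""
    return eps + h / eps ** 2

def best_eps(h):
    """Minimizer of eps + h / eps**2: the derivative 1 - 2*h/eps**3 vanishes here."""
    return (2.0 * h) ** (1.0 / 3.0)

def fitted_rate(hs):
    """Least-squares slope of log(minimal error) against log(h)."""
    xs = [math.log(h) for h in hs]
    ys = [math.log(error_bound(h, best_eps(h))) for h in hs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

step_sizes = [10.0 ** (-k) for k in range(2, 9)]
rate = fitted_rate(step_sizes)                 # slope of the optimal error in h
simple_choice = error_bound(1e-6, 1e-6 ** (1.0 / 3.0))  # eps = h**(1/3)
optimal_choice = error_bound(1e-6, best_eps(1e-6))
```

Under this assumed shape the fitted rate equals $1/3$, and the simpler choice $\varepsilon = h^{1/3}$ gives $2h^{1/3}$, which differs from the exact minimum only by a constant factor. Such a factor is absorbed when $N$ is redefined.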
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